AITopics | central machine

Collaborating Authors

central machine

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Distributed Newton Can Communicate Less and Resist Byzantine Workers

Neural Information Processing SystemsDec-24-2025, 16:20:49 GMT

We develop a distributed second order optimization algorithm that is communication-efficient as well as robust against Byzantine failures of the worker machines. We propose an iterative approximate Newton-type algorithm, where the worker machines communicate \emph{only once} per iteration with the central machine. This is in sharp contrast with the state-of-the-art distributed second order algorithms like GIANT \cite{giant}, DINGO\cite{dingo}, where the worker machines send (functions of) local gradient and Hessian sequentially; thus ending up communicating twice with the central machine per iteration. Furthermore, we employ a simple norm based thresholding rule to filter-out the Byzantine worker machines. We establish the linear-quadratic rate of convergence of our proposed algorithm and establish that the communication savings and Byzantine resilience attributes only correspond to a small statistical error rate for arbitrary convex loss functions. To the best of our knowledge, this is the first work that addresses the issue of Byzantine resilience in second order distributed optimization. Furthermore, we validate our theoretical results with extensive experiments on synthetically generated and benchmark LIBSVM \cite{libsvm} data-set and demonstrate convergence guarantees.

name change, proceedings, resist byzantine worker, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

Add feedback

High-Dimensional Differentially Private Quantile Regression: Distributed Estimation and Statistical Inference

Shen, Ziliang, Wang, Caixing, Wang, Shaoli, Yan, Yibo

arXiv.org Machine LearningAug-8-2025

With the development of big data and machine learning, privacy concerns have become increasingly critical, especially when handling heterogeneous datasets containing sensitive personal information. Differential privacy provides a rigorous framework for safeguarding individual privacy while enabling meaningful statistical analysis. In this paper, we propose a differentially private quantile regression method for high-dimensional data in a distributed setting. Quantile regression is a powerful and robust tool for modeling the relationships between the covariates and responses in the presence of outliers or heavy-tailed distributions. To address the computational challenges due to the non-smoothness of the quantile loss function, we introduce a Newton-type transformation that reformulates the quantile regression task into an ordinary least squares problem. Building on this, we develop a differentially private estimation algorithm with iterative updates, ensuring both near-optimal statistical accuracy and formal privacy guarantees. For inference, we further propose a differentially private debiased estimator, which enables valid confidence interval construction and hypothesis testing. Additionally, we propose a communication-efficient and differentially private bootstrap for simultaneous hypothesis testing in high-dimensional quantile regression, suitable for distributed settings with both small and abundant local data. Extensive simulations demonstrate the robustness and effectiveness of our methods in practical scenarios.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

2508.05212

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Add feedback

Distributed Newton Can Communicate Less and Resist Byzantine Workers

Neural Information Processing SystemsOct-11-2024, 10:29:06 GMT

algorithm, byzantine resilience, resist byzantine worker, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.43)

Add feedback

Finding Decision Tree Splits in Streaming and Massively Parallel Models

Pham, Huy, Ta, Hoang, Vu, Hoa T.

arXiv.org Artificial IntelligenceApr-17-2024

In this work, we provide data stream algorithms that compute optimal splits in decision tree learning. In particular, given a data stream of observations $x_i$ and their labels $y_i$, the goal is to find the optimal split point $j$ that divides the data into two sets such that the mean squared error (for regression) or misclassification rate (for classification) is minimized. We provide various fast streaming algorithms that use sublinear space and a small number of passes for these problems. These algorithms can also be extended to the massively parallel computation model. Our work, while not directly comparable, complements the seminal work of Domingos and Hulten (KDD 2000).

algorithm, misclassification rate, optimal split, (16 more...)

arXiv.org Artificial Intelligence

2403.19867

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Vietnam > Hanoi > Hanoi (0.04)
Asia > Singapore > Central Region > Singapore (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

DDAC-SpAM: A Distributed Algorithm for Fitting High-dimensional Sparse Additive Models with Feature Division and Decorrelation

He, Yifan, Wu, Ruiyang, Zhou, Yong, Feng, Yang

arXiv.org Artificial IntelligenceJul-8-2023

Distributed statistical learning has become a popular technique for large-scale data analysis. Most existing work in this area focuses on dividing the observations, but we propose a new algorithm, DDAC-SpAM, which divides the features under a high-dimensional sparse additive model. Our approach involves three steps: divide, decorrelate, and conquer. The decorrelation operation enables each local estimator to recover the sparsity pattern for each additive component without imposing strict constraints on the correlation structure among variables. The effectiveness and efficiency of the proposed algorithm are demonstrated through theoretical analysis and empirical results on both synthetic and real data. The theoretical results include both the consistent sparsity pattern recovery as well as statistical inference for each additive functional component. Our approach provides a practical solution for fitting sparse additive models, with promising applications in a wide range of domains.

artificial intelligence, exp, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2205.07932

Country:

North America > United States > New York (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Distributed Nonparametric Function Estimation: Optimal Rate of Convergence and Cost of Adaptation

Cai, T. Tony, Wei, Hongji

arXiv.org Machine LearningJun-30-2021

Distributed minimax estimation and distributed adaptive estimation under communication constraints for Gaussian sequence model and white noise model are studied. The minimax rate of convergence for distributed estimation over a given Besov class, which serves as a benchmark for the cost of adaptation, is established. We then quantify the exact communication cost for adaptation and construct an optimally adaptive procedure for distributed estimation over a range of Besov classes. The results demonstrate significant differences between nonparametric function estimation in the distributed setting and the conventional centralized setting. For global estimation, adaptation in general cannot be achieved for free in the distributed setting. The new technical tools to obtain the exact characterization for the cost of adaptation can be of independent interest.

communication constraint, communication cost, estimation, (16 more...)

arXiv.org Machine Learning

2107.00179

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Reading (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Distributed function estimation: adaptation using minimal communication

Szabo, Botond, van Zanten, Harry

arXiv.org Machine LearningMar-28-2020

Distributed methods have attracted a lot of attention in the statistics and machine learning communities recently. There are several reasons for this, the most prominent ones being that they provide a way of dealing with large datasets and with privacy considerations. The theoretical literature on distributed methods is still rather minimal at the moment. A number of papers have recently investigated fundamental performance limits in distributed models, in particular pointing out issues that occur in high-dimensional or nonparametric problems, see for instance [1, 2, 4, 8, 16, 17, 21, 24, 27]. For example, optimal rates in distributed function estimation depend on the amount of communication that is allowed, and the relation of that amount with the regularity of the unknown function. The lower bounds obtained in [25] and [28] and the subsequent adaptation results in [25] show that in particular, automatically adapting to the smoothness of the unknown function is a complicated issue in communication restricted distributed settings. In the present paper we study this problem from a different, we think relevant and interesting perspective, not restricting communication a priori, but asking for rate-optimal procedures that require minimal communication.

communication, logn, theorem 2, (16 more...)

arXiv.org Machine Learning

2003.12838

Country:

Europe > Netherlands > South Holland > Leiden (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Distributed Gaussian Mean Estimation under Communication Constraints: Optimal Rates and Communication-Efficient Algorithms

Cai, T. Tony, Wei, Hongji

arXiv.org Machine LearningJan-23-2020

We study distributed estimation of a Gaussian mean under communication constraints in a decision theoretical framework. Minimax rates of convergence, which characterize the tradeoff between the communication costs and statistical accuracy, are established in both the univariate and multivariate settings. Communication-efficient and statistically optimal procedures are developed. In the univariate case, the optimal rate depends only on the total communication budget, so long as each local machine has at least one bit. However, in the multivariate case, the minimax rate depends on the specific allocations of the communication budgets among the local machines. Although optimal estimation of a Gaussian mean is relatively simple in the conventional setting, it is quite involved under the communication constraints, both in terms of the optimal procedure design and lower bound argument. The techniques developed in this paper can be of independent interest. An essential step is the decomposition of the minimax estimation problem into two stages, localization and refinement. This critical decomposition provides a framework for both the lower bound analysis and optimal procedure design.

communication constraint, estimation, log 1, (11 more...)

arXiv.org Machine Learning

2001.08877

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
Asia > Middle East > Jordan (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (0.50)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Communication-Efficient and Byzantine-Robust Distributed Learning

Ghosh, Avishek, Maity, Raj Kumar, Kadhe, Swanand, Mazumdar, Arya, Ramchandran, Kannan

arXiv.org Machine LearningNov-21-2019

We develop a communication-efficient distributed learning algorithm that is robust against Byzantine worker machines. We propose and analyze a distributed gradient-descent algorithm that performs a simple thresholding based on gradient norms to mitigate Byzantine failures. We show the (statistical) error-rate of our algorithm matches that of [YCKB18], which uses more complicated schemes (like coordinate-wise median or trimmed mean) and thus optimal. Furthermore, for communication efficiency, we consider a generic class of {\delta}-approximate compressors from [KRSJ19] that encompasses sign-based compressors and top-k sparsification. Our algorithm uses compressed gradients and gradient norms for aggregation and Byzantine removal respectively. We establish the statistical error rate of the algorithm for arbitrary (convex or non-convex) smooth loss function. We show that, in the regime when the compression factor {\delta} is constant and the dimension of the parameter space is fixed, the rate of convergence is not affected by the compression operation, and hence we effectively get the compression for free. Moreover, we extend the compressed gradient descent algorithm with error feedback proposed in [KRSJ19] for the distributed setting. We have experimentally validated our results and shown good performance in convergence for convex (least-square regression) and non-convex (neural network training) problems.

algorithm, compressor, gradient, (15 more...)

arXiv.org Machine Learning

1911.09721

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Structure Learning of Sparse GGMs over Multiple Access Networks

Tavassolipour, Mostafa, Karamzade, Armin, Mirzaeifard, Reza, Motahari, Seyed Abolfazl, Shalmani, Mohammad-Taghi Manzuri

arXiv.org Machine LearningDec-26-2018

A central machine is interested in estimating the underlying structure of a sparse Gaussian Graphical Model (GGM) from datasets distributed across multiple local machines. The local machines can communicate with the central machine through a wireless multiple access channel. In this paper, we are interested in designing effective strategies where reliable learning is feasible under power and bandwidth limitations. Two approaches are proposed: Signs and Uncoded methods. In Signs method, the local machines quantize their data into binary vectors and an optimal channel coding scheme is used to reliably send the vectors to the central machine where the structure is learned from the received data. In Uncoded method, data symbols are scaled and transmitted through the channel. The central machine uses the received noisy symbols to recover the structure. Theoretical results show that both methods can recover the structure with high probability for large enough sample size. Experimental results indicate the superiority of Signs method over Uncoded method under several circumstances.

central machine, local machine, matrix, (16 more...)

arXiv.org Machine Learning

1812.10437

Country:

Asia > Middle East > Iran > Tehran Province > Tehran (0.05)
North America > United States > California (0.04)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Industry: Telecommunications (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback